BIOTECHNOLOGY SYSTEMS BRANCH



RAW SEQUENCE LISTING 
ERROR REPORT 




The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 



Application Serial Number: [handwritten entry, illegible in scan]
Source: [handwritten entry, illegible in scan]

Date Processed by STIC: [handwritten entry, illegible in scan]




THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION AND PATENTIN SOFTWARE QUESTIONS, PLEASE CONTACT 
MARK SPENCER, TELEPHONE: 703-308-4212; FAX: 703-308-4221 
Effective 12/13/03 : TELEPHONE: 571-272-2510; FAX: 571-273-0221 



TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 4.1 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW FOR ADDRESS:

http://www.uspto.gov/web/offices/pac/checker/chkr41note.htm



Applicants submitting genetic sequence information electronically on diskette or CD-ROM should be aware that there is
a possibility that the disk/CD-ROM may have been affected by treatment given to all incoming mail.
Please consider using alternate methods of submission for the disk/CD-ROM or replacement disk/CD-ROM.

Any reply including a sequence listing in electronic form should NOT be sent to the 20231 zip code address for the
United States Patent and Trademark Office and instead should be sent via the following to the indicated addresses:

1. EFS-Bio (<http://www.uspto.gov/ebc/efs/downloads/documents.htm>, EFS Submission
   User Manual - ePAVE)

2. U.S. Postal Service: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450

3. Hand Carry directly to (EFFECTIVE 12/01/03):
   U.S. Patent and Trademark Office, Box Sequence, Customer Window, Lobby, Room 1B03, Crystal Plaza Two,
   2011 South Clark Place, Arlington, VA 22202

4. Federal Express, United Parcel Service, or other delivery service to: U.S. Patent and Trademark Office,
   Box Sequence, Room 4B03-Mailroom, Crystal Plaza Two, 2011 South Clark Place, Arlington, VA 22202



Revised 10/0S/03 



Raw Sequence Listing Error Summary

ERROR DETECTED          SUGGESTED CORRECTION          SERIAL NUMBER: ____

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

[x]  1  Wrapped Nucleics / Wrapped Aminos
        The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
        was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
        prevent "wrapping."

[ ]  2  Invalid Line Length
        The rules require that a line not exceed 72 characters in length. This includes white spaces.

[x]  3  Misaligned Amino Numbering
        The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
        use space characters, instead.

[ ]  4  Non-ASCII
        The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
        ensure your subsequent submission is saved in ASCII text.

[ ]  5  Variable Length
        Sequence(s) ____ contain n's or Xaa's representing more than one residue. Per Sequence Rules,
        each n or Xaa can only represent a single residue. Please present the maximum number of each
        residue having variable length and indicate in the <220>-<223> section that some may be missing.

[ ]  6  PatentIn 2.0 "bug"
        A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
        sequence(s) ____. Normally, PatentIn would automatically generate this section from the
        previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
        the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
        Artificial or Unknown sequences.

[ ]  7  Skipped Sequences (OLD RULES)
        Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:

        (2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
        (i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
        (xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
        This sequence is intentionally skipped

        Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

[ ]  8  Skipped Sequences (NEW RULES)
        Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence.

        <210> sequence id number
        <400> sequence id number
        000

[ ]  9  Use of n's or Xaa's (NEW RULES)
        Use of n's and/or Xaa's have been detected in the Sequence Listing.
        Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
        In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

[ ] 10  Invalid <213> Response
        Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
        scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
        is Artificial Sequence.

[ ] 11  Use of <220>
        Sequence(s) ____ missing the <220> "Feature" and associated numeric identifiers and responses.
        Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
        "Unknown." Please explain source of genetic material in <220> to <223> section.
        (See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

[ ] 12  PatentIn 2.0 "bug"
        Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
        resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
        listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

[ ] 13  Misuse of n/Xaa
        "n" can only represent a single nucleotide; "Xaa" can only represent a single amino acid.
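
The rules in the summary about when a <220>-<223> feature section becomes mandatory can be sketched as a small check. This is only an illustration under stated assumptions: the function name `needs_feature_section` and its three string arguments are hypothetical, and the real Checker program may apply more conditions than the ones quoted above.

```python
import re

def needs_feature_section(organism: str, seq_type: str, residues: str) -> bool:
    """Return True if a <220>-<223> section looks mandatory (hypothetical helper).

    Conditions taken from the error summary above:
      - <213> response is "Unknown" or "Artificial Sequence"
      - the sequence uses "n" (nucleotides) or "Xaa" (amino acids)
    """
    if organism.strip().lower() in ("unknown", "artificial sequence"):
        return True
    if seq_type.strip().upper() == "PRT":
        # "Xaa" stands for a single unknown or variable amino acid.
        return re.search(r"\bXaa\b", residues) is not None
    # For DNA/RNA, "n" stands for a single unknown base.
    return "n" in residues.lower()

print(needs_feature_section("Artificial sequence", "DNA", "gggcataata"))
print(needs_feature_section("Deinococcus radiodurans", "DNA", "atggcaaggg"))
```

The first call reports that a feature section is needed because of the <213> response alone; the second finds no trigger in the fragment given.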



AMC - Biotechnology Systems Branch - 09/09/2003 
RAW SEQUENCE LISTING                              DATE: 02/17/2004
PATENT APPLICATION: US/10/664,044                 TIME: 12:17:46

Input Set : A:\PTO.YF.txt
Output Set: N:\CRF4\02172004\J664044.raw

     2 <110> APPLICANT: Japan Atomic Energy Research Institute
W--> 3 <120> TITLE OF INVENTION: A method for efficiently determining a DNA strand
     4       break
W--> 5 <130> FILE REFERENCE: 030217
C--> 6 <140> CURRENT APPLICATION NUMBER: US/10/664,044
C--> 6 <141> CURRENT FILING DATE: 2003-09-17
W--> 6 <160> NUMBER OF SEQ ID: 4
E--> 8 <200> 1



ERRORED SEQUENCES 

     9 <211> LENGTH: 284
     10 <212> TYPE: PRT
     11 <213> ORGANISM: Deinococcus radiodurans, strain KD8301
W--> 12 <220> FEATURE:
     13 <223> OTHER INFORMATION: Amino acid sequence of DNA repair promoting protein,
     14       PprA, of Deinococcus radiodurans, strain KD8301
E--> 16 <400> SEQUENCE: 1
E--> 17 atg gca agg gct aaa gca aaa gac caa acg gac ggc atc tac gcc gcc
E--> 18 48
     19 Met Ala Arg Ala Lys Ala Lys Asp Gln Thr Asp Gly Ile Tyr Ala Ala
E--> 20 1 5 10
E--> 21 ttc gac acc ttg atg agc acg gcg ggc gtg gac agc cag atc gcc gcc
E--> 22 96
     23 Phe Asp Thr Leu Met Ser Thr Ala Gly Val Asp Ser Gln Ile Ala Ala
E--> 24 20 25 30
E--> 25 ctc gcc gcg agt gag gcc gac gcg ggc acg ctg gac gcg gcg ctc acg
E--> 26 144
     27 Leu Ala Ala Ser Glu Ala Asp Ala Gly Thr Leu Asp Ala Ala Leu Thr
E--> 28 35 40 45
E--> 29 cag tcc ttg caa gaa gcg cag ggg cgc tgg ggg ctg ggg ctg cac cac
E--> 30 192
     31 Gln Ser Leu Gln Glu Ala Gln Gly Arg Trp Gly Leu Gly Leu His His
E--> 32 50 55 60
E--> 33 ctg cgc cat gag gcg cgg ctg acc gac gac ggc gac atc gaa att ctg
E--> 34 240
     35 Leu Arg His Glu Ala Arg Leu Thr Asp Asp Gly Asp Ile Glu Ile Leu
E--> 36 65 70 75
E--> 37 80
E--> 38 acc gat ggc cgc ccc agc gcc cgc gtg agc gag ggc ttc gga gca ctc
E--> 39 288


     40 Thr Asp Gly Arg Pro Ser Ala Arg Val Ser Glu Gly Phe Gly Ala Leu
E--> 41 85 90 95
E--> 42 gcg cag gcc tac gcg ccc atg cag gcg ctc gac gaa cgc ggc ctg agc
E--> 43 336
     44 Ala Gln Ala Tyr Ala Pro Met Gln Ala Leu Asp Glu Arg Gly Leu Ser
E--> 45 100 105 110
E--> 46 cag tgg gcg gcg ctc ggc gag ggc tac cgc gct ccc ggc gac ttg ccg
E--> 47 384
     48 Gln Trp Ala Ala Leu Gly Glu Gly Tyr Arg Ala Pro Gly Asp Leu Pro
E--> 49 115 120 125
E--> 50 ttg gcg cag ctc aag gtg ctg atc gag cac gcc cgc gac ttc gaa acc
E--> 51 432
     52 Leu Ala Gln Leu Lys Val Leu Ile Glu His Ala Arg Asp Phe Glu Thr
E--> 53 130 135 140
E--> 54 gac tgg tcg gcg ggg cgc ggc gaa acc ttt cag cgc gtg tgg cgc aag
E--> 55 480
     56 Asp Trp Ser Ala Gly Arg Gly Glu Thr Phe Gln Arg Val Trp Arg Lys
E--> 57 145 150 155
E--> 58 160
E--> 59 ggc gac acc ctg ttt gtc gag gtg gcc cgg ccc gcg tcc gcc gag gcc
E--> 60 528
     61 Gly Asp Thr Leu Phe Val Glu Val Ala Arg Pro Ala Ser Ala Glu Ala
E--> 62 165 170 175
E--> 63 gcg ctc tcc gac gct gcc tgg gac gtg atc gcc agc atc aag gac cgc
E--> 64 576
     65 Ala Leu Ser Asp Ala Ala Trp Asp Val Ile Ala Ser Ile Lys Asp Arg
E--> 66 180 185 190
E--> 67 gcc ttc cag cgt gag ctg atg cgc cgc agc gag aag gac ggg atg ctc
E--> 68 624
     69 Ala Phe Gln Arg Glu Leu Met Arg Arg Ser Glu Lys Asp Gly Met Leu
E--> 70 195 200 205
E--> 71 ggc gcc ctg ctc ggg gct cgc cac gcc ggg gcc aag gcc aac ctc gcc
E--> 72 672
     73 Gly Ala Leu Leu Gly Ala Arg His Ala Gly Ala Lys Ala Asn Leu Ala
E--> 74 210 215 220
E--> 75 cag ctg ccc gaa gcg cac ttc acc gtg cag gcg ttc gtg cag acc ctc
E--> 76 720
     77 Gln Leu Pro Glu Ala His Phe Thr Val Gln Ala Phe Val Gln Thr Leu
E--> 78 225 230 235
E--> 79 240
E--> 80 agc gga gcc gcc gcc cgc aac gcc gag gag tac cgc gcg gcc ctg aaa
E--> 81 768
     82 Ser Gly Ala Ala Ala Arg Asn Ala Glu Glu Tyr Arg Ala Ala Leu Lys
E--> 83 245 250 255
E--> 84 acc gcc gcc gct gcg ctg gag gaa tac cag ggc gtg acc acc cgc caa
E--> 85 816
     86 Thr Ala Ala Ala Ala Leu Glu Glu Tyr Gln Gly Val Thr Thr Arg Gln
E--> 87 260 265 270
E--> 88 ctg tcc gaa gtg ctg cgg cac ggc ctg cgc gag agc tga
E--> 89 855
E--> 90 Leu Ser Glu Val Leu Arg His Gly Leu Arg Glu Ser Sto
E--> 91 275 280 285

E--> 93 <200> 2

94 <211> LENGTH: 855 

95 <212> TYPE: DNA 

96 <213> ORGANISM: Deinococcus radiodurans, strain KD8301 
W--> 97 <220> FEATURE:
     98 <223> OTHER INFORMATION: Nucleotide sequence of DNA repair promoting protein,
     99       pprA, of Deinococcus radiodurans, strain KD8301.
E--> 101 <400> SEQUENCE: 2
     102 atggcaaggg ctaaagcaaa agaccaaacg gacggcatct acgccgcctt cgacaccttg 60
     103 atgagcacgg cgggcgtgga cagccagatc gccgccctcg ccgcgagtga ggccgacgcg 120
     104 ggcacgctgg acgcggcgct cacgcagtcc ttgcaagaag cgcaggggcg ctgggggctg 180
     105 gggctgcacc acctgcgcca tgaggcgcgg ctgaccgacg acggcgacat cgaaattctg 240
     106 accgatggcc gccccagcgc ccgcgtgagc gagggcttcg gagcactcgc gcaggcctac 300
     107 gcgcccatgc aggcgctcga cgaacgcggc ctgagccagt gggcggcgct cggcgagggc 360
     108 taccgcgctc ccggcgactt gccgttggcg cagctcaagg tgctgatcga gcacgcccgc 420
     109 gacttcgaaa ccgactggtc ggcggggcgc ggcgaaacct ttcagcgcgt gtggcgcaag 480
     110 ggcgacaccc tgtttgtcga ggtggcccgg cccgcgtccg ccgaggccgc gctctccgac 540
     111 gctgcctggg acgtgatcgc cagcatcaag gaccgcgcct tccagcgtga gctgatgcgc 600
     112 cgcagcgaga aggacgggat gctcggcgcc ctgctcgggg ctcgccacgc cggggccaag 660
     113 gccaacctcg cccagctgcc cgaagcgcac ttcaccgtgc aggcgttcgt gcagaccctc 720
     114 agcggagccg ccgcccgcaa cgccgaggag taccgcgcgg ccctgaaaac cgccgccgct 780
     115 gcgctggagg aataccaggg cgtgaccacc cgccaactgt ccgaagtgct gcggcacggc 840
E--> 116 ctgcgcgaga gctga
     117 855

<211> LENGTH: 35
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Sense primer for amplifying pprA gene.
<400> SEQUENCE: 3
gggcataata aaggccatat ggcaagggct aaagc 35

<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: Artificial sequence
<220> FEATURE:
<223> OTHER INFORMATION: Antisense primer for amplifying pprA gene.
<400> SEQUENCE: 4
ttttggatcc tcagctctcg cgcaggccgt gc 32
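
The stated <211> LENGTH values can be cross-checked against the residues actually printed. The sketch below is one way to do that, assuming blocks of bases separated by spaces with an optional running count at the end of each line; `count_bases` is a hypothetical helper and not part of the Checker program.

```python
def count_bases(sequence_lines):
    """Count nucleotide characters, ignoring spaces and trailing position counts."""
    total = 0
    for line in sequence_lines:
        for token in line.split():
            if token.isdigit():
                continue  # running count printed at the end of a line
            total += len(token)
    return total

# Primer sequences as printed for SEQ ID NO 3 and 4 in the listing.
sense = ["gggcataata aaggccatat ggcaagggct aaagc"]
antisense = ["ttttggatcc tcagctctcg cgcaggccgt gc 32"]
print(count_bases(sense), count_bases(antisense))
```

Comparing each result with the declared LENGTH is the kind of check behind a "No. of Bases conflict" message, although the exact logic the Office's software uses is not shown in this report.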
RAW SEQUENCE LISTING ERROR SUMMARY DATE: 02/17/2004

PATENT APPLICATION: US/10/664,044 TIME: 12:17:47 

Input Set : A:\PTO.YF.txt 

Output Set: N:\CRF4\02172004\J664044.raw 

Invalid Line Length: 

The rules require that a line not exceed 72 characters in length. This includes spaces. 

Seq#:2; Line(s) 116 
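
A check of this kind can be run before resubmitting. The following is a minimal sketch, assuming the listing is available as a text string; `find_line_problems` is a hypothetical name, and the tab check reflects the separate instruction in this report to use spaces rather than tab codes.

```python
MAX_LEN = 72  # limit quoted in the error summary; spaces count toward it

def find_line_problems(text):
    """Return (line_number, message) pairs for over-long lines and tab characters."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LEN:
            problems.append((number, "line exceeds 72 characters"))
        if "\t" in line:
            problems.append((number, "tab character found"))
    return problems

# A short sequence line padded with trailing spaces still counts as too long.
sample = "ctgcgcgaga gctga" + " " * 60 + "\n<211> LENGTH: 855\n1\t5\n"
for number, message in find_line_problems(sample):
    print(number, message)
```

Trailing white space is easy to miss in an editor, which may be why a line that looks short on the printout can still be flagged.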
VERIFICATION SUMMARY DATE: 02/17/2004

PATENT APPLICATION: US/10/664,044 TIME: 12:17:47 

Input Set : A:\PTO.YF.txt 

Output Set: N:\CRF4\02172004\J664044.raw 

L:3 M:283 W: Missing Blank Line separator, <120> field identifier 

L:5 M:283 W: Missing Blank Line separator, <130> field identifier 

L:6 M:270 C: Current Application Number differs, Replaced Current Application No 

L:6 M:271 C: Current Filing Date differs, Replaced Current Filing Date 

L:6 M:283 W: Missing Blank Line separator, <160> field identifier 

L:8 M:250 E: Invalid Numeric Identifier, INVALID IDENTIFIER 

L:12 M:283 W: Missing Blank Line separator, <220> field identifier 

L:16 M:282 E: Numeric Field Identifier Missing, <210> is required. 

L:16 M:212 E: (34) Invalid or duplicate Sequence ID Number, SEQUENCE ID NOS:0 differs:1
L:17 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:18 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:1
M:332 Repeated in SeqNo=33



L:21 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:25 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:29 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:33 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:38 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:42 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:46 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:50 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:54 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:59 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:63 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:67 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:71 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:75 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:80 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:84 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 16
L:88 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 13
L:90 M:330 E: (2) Invalid Amino Acid Designator, NUMBER OF INVALID KEYS: 1
L:91 M:252 E: No. of Seq. differs, <211> LENGTH: Input: 284 Found: 570 SEQ: 0



L:93 M:250 E: Invalid Numeric Identifier, INVALID IDENTIFIER 

L:97 M:283 W: Missing Blank Line separator, <220> field identifier 

L:101 M:212 E: (34) Invalid or duplicate Sequence ID Number, SEQUENCE ID NOS:0 differs: 2 
L:116 M:254 E: No. of Bases conflict, LENGTH: Input : 0 Counted: 855 SEQ: 2 
L:119 M:250 E: Invalid Numeric Identifier, INVALID IDENTIFIER 
L:123 M:283 W: Missing Blank Line separator, <220> field identifier 

L:126 M:212 E: (34) Invalid or duplicate Sequence ID Number, SEQUENCE ID NOS:0 differs: 3 
M:254 Repeated in SeqNo=33 

L:130 M:250 E: Invalid Numeric Identifier, INVALID IDENTIFIER 
L:134 M:283 W: Missing Blank Line separator, <220> field identifier 

L:137 M:212 E: (34) Invalid or duplicate Sequence ID Number, SEQUENCE ID NOS:0 differs: 4
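
The summary lines follow a regular `L:<line> M:<code> <letter>: <message>` pattern, so they can be tallied mechanically. The sketch below is an illustration only: `parse_summary` is a hypothetical helper, and reading W, C and E as warning, correction and error is an assumption based on the messages, not something this report states.

```python
import re
from collections import Counter

# One summary entry, e.g. "L:8 M:250 E: Invalid Numeric Identifier, INVALID IDENTIFIER"
ENTRY_RE = re.compile(r"^L:\s*(\d+)\s+M:(\d+)\s+([WCE]):\s*(.*)$")

def parse_summary(text):
    """Return one dict per summary entry; lines that do not match are skipped."""
    records = []
    for raw in text.splitlines():
        match = ENTRY_RE.match(raw.strip())
        if match:
            records.append({
                "line": int(match.group(1)),
                "code": int(match.group(2)),
                "severity": match.group(3),
                "message": match.group(4),
            })
    return records

sample = """L:3 M:283 W: Missing Blank Line separator, <120> field identifier
L:6 M:270 C: Current Application Number differs, Replaced Current Application No
L:8 M:250 E: Invalid Numeric Identifier, INVALID IDENTIFIER
M:332 Repeated in SeqNo=33"""
records = parse_summary(sample)
print(Counter(r["severity"] for r in records))
```

Grouping by message code (for example 330 or 283) makes it easier to see that most entries in this report trace back to a handful of underlying problems.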
<110> Smith, John; Smithgene Inc.

<120> Example of a Sequence Listing

<130> 01-00001

<140> PCT/EP98/00001
<141> 1998-12-31

<150> US 08/999,999
<151> 1997-10-15

<160> 4

<170> PatentIn version 2.0

<210> 1
<211> 389
<212> DNA
<213> Paramecium sp.

<220>
<221> CDS
<222> (279)...(389)

<300>
<301> Doe, Richard
<302> Isolation and Characterization of a Gene Encoding a
      Protease from Paramecium sp.
<303> Journal of Genes
<304> 1
<305> 4
<306> 1-7
<307> 1988-06-31
<308> 123456
<309> 1988-06-31

<400> 1
[first block illegible in scan] actcctgtgt cctcttctct ctgggcttct caccctgcta atcagatctc
agggagagtg tcttgaccct cctctgcctt tgcagcttca caggcaggca ggcaggcagc
tgatgtggca attgctggca gtgccacagg cttttcagcc aggcttaggg tgggttccgc
cgcggcgcgg cggcccctct cgcgctcctc tcgcgcctct ctctcgctct cctctcgctc
990CCt9*tl *99t9*9C09 9*99*99900 C*9tC#9C ^

ttg ccc tec **« t99 ^cct- .99* tc r tto 
Leu . Scr The Lys . Trp ^Cro Cly - Ph 



10 



IS 



23 



30 













9t t 




a 1 9 


etc 


*9C 


Vol 


Scr 


Met 


The 


Scr 












- • 1- 








or t 


tot 


ttg 


t tc 


*- 0 0 


Vol 


Cys 


Leu 


rhc 


Cln 






20 






C69 


CC9 . 




ctx 




Cln 


Pro 


Asn — u - ■• 






' 35 









2*C 



*3<< 



)69 



<210> 2
<211> 37
<212> PRT
<213> Paramecium sp.

<400> 2

Met Val Ser Met Phe Ser Leu Ser Phe Lys Trp Pro Gly Phe Cys Leu
                5                   10                  15

Phe Val Cys Leu Phe Gln Cys Pro Lys Val Leu Pro Cys His Ser Ser
            20                  25                  30

Leu Gln Pro Asn Leu
        35

<210> 3
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Designed peptide based on size and polarity to act as a
      linker between the alpha and beta chains of Protein XYZ

<400> 3

Met Val Asn Leu Glu Pro Met His Thr Glu Ile
1               5                   10

<210> 4
<400> 4
000
[...] table. The numeric identifier shall be used only in the "Sequence
Listing." The order and presentation of the items of information in the
"Sequence Listing" shall conform to the arrangement given below. Each
item of information shall begin on a new line and shall begin with the
numeric identifier enclosed in angle brackets as shown. The submission
of those items of information designated with an "M" is mandatory. The
submission of those items of information designated with an "O" is
optional. Numeric identifiers <110> through <170> shall only be set
forth at the beginning of the "Sequence Listing." The following table
illustrates the numeric identifiers.



Columns: Numeric Identifier / Definition / Comments and Format /
Mandatory (M) or Optional (O)

<110>  Applicant
       Comments and format: Preferably max. of 10 names; one name per
       line; preferable format: Surname, Other Names and/or Initials
       M or O: M

<120>  Title of Invention
       M or O: M

<130>  File Reference
       Comments and format: Personal file reference
       M or O: M, when filed prior to assignment of appl. number

<140>  Current Application Number
       Comments and format: Specify as: US 07/999,999 or PCT/US96/99999
       M or O: M, if available

<141>  Current Filing Date
       Comments and format: Specify as: yyyy-mm-dd
       M or O: M, if available

<150>  Prior Application Number
       Comments and format: Specify as: US 07/999,999 or PCT/US96/99999
       M or O: M, if applicable; include priority documents under
       35 USC 119 and 120

<151>  Prior Application Filing Date
       Comments and format: Specify as: yyyy-mm-dd
       M or O: M, if applicable

<160>  Number of SEQ ID NOs
       Comments and format: Count includes total number of SEQ ID NOs
       M or O: M

<170>  Software
       Comments and format: Name of software used to create the
       Sequence Listing

<210>  SEQ ID NO:#:
       Comments and format: Response shall be an integer representing
       the SEQ ID NO shown
       M or O: M

<211>  Length
       Comments and format: Respond with an integer expressing the
       number of bases or amino acid residues
       M or O: M

<212>  Type
       Comments and format: Whether presented sequence molecule is DNA,
       RNA, or PRT (protein). If a nucleotide sequence contains both DNA
       and RNA fragments, the type shall be "DNA." In addition, the
       combined DNA/RNA molecule shall be further described in the
       <220> to <223> feature section.
       M or O: M

<213>  Organism
       Comments and format: Scientific name, i.e. Genus/species,
       Unknown or Artificial Sequence. In addition, the "Unknown" or
       "Artificial Sequence" organisms shall be further described in the
       <220> to <223> feature section.
       M or O: M

<220>  Feature
       Comments and format: Leave blank after <220>. <221-223> provide
       for a description of points of biological significance in the
       sequence.
       M or O: M, under the following conditions: if "n," "Xaa," or a
       modified or unusual L-amino acid or modified base was used in a
       sequence; if ORGANISM is "Artificial Sequence" or "Unknown"; if
       molecule is combined DNA/RNA.

<221>  Name/Key
       Comments and format: Provide appropriate identifier for feature,
       preferably from WIPO Standard ST.25 (1998), Appendix 2,
       Tables 5 and 6
       M or O: M, under the following conditions: if "n," "Xaa," or a
       modified or unusual L-amino acid or modified base was used in a
       sequence

<222>  Location
       Comments and format: Specify location within sequence; where
       appropriate state number of first and last bases/amino acids in
       feature
       M or O: M, under the following conditions: if "n," "Xaa," or a
       modified or unusual L-amino acid or modified base was used in a
       sequence

<223>  Other Information
       Comments and format: Other relevant information; four lines
       maximum
       M or O: M, under the following conditions: if "n," "Xaa," or a
       modified or unusual L-amino acid or modified base was used in a
       sequence; if ORGANISM is "Artificial Sequence" or "Unknown"; if
       molecule is combined DNA/RNA

<300>  Publication Information
       Comments and format: Leave blank after <300>
       M or O: O

<301>  Authors
       Comments and format: Preferably max. of ten named authors of
       publication; specify one name per line; preferable format:
       Surname, Other Names and/or Initials
       M or O: O

<302>  Title

<303>  Journal

<304>  Volume

<305>  Issue

<306>  Pages

<307>  Date
       Comments and format: Journal date on which data published;
       specify as yyyy-mm-dd, MMM-yyyy or Season-yyyy

<308>  Database Accession Number
       Comments and format: Accession number assigned by database
       including database name

<309>  Database Entry Date
       Comments and format: Date of entry in database; specify as
       yyyy-mm-dd or MMM-yyyy

<310>  Patent Document Number
       Comments and format: Document number; for patent-type citations
       only. Specify as, for example, US 07/999,999

<311>  Patent Filing Date
       Comments and format: Document filing date, for patent-type
       citations only; specify as yyyy-mm-dd

<312>  Publication Date
       Comments and format: Document publication date, for patent-type
       citations only; specify as yyyy-mm-dd

<313>  Relevant Residues
       Comments and format: FROM (position) TO (position)

<400>  Sequence
       Comments and format: SEQ ID NO should follow the numeric
       identifier and should appear on the line preceding the actual
       sequence
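
The identifier table lends itself to a simple scan of a listing for identifiers that are not in the table and for unconditional "M" items that never appear. This is a rough sketch under assumptions: `scan_identifiers` is a hypothetical helper, the mandatory set reflects one reading of the table above, and conditional items such as <130> or <220> are deliberately left out.

```python
import re

# Identifiers that appear in the table above.
KNOWN = (
    {"110", "120", "130", "140", "141", "150", "151", "160", "170"}
    | {"210", "211", "212", "213", "220", "221", "222", "223"}
    | {str(n) for n in range(300, 314)}
    | {"400"}
)
# Assumed unconditional "M" entries (conditional ones are not checked here).
ALWAYS_MANDATORY = {"110", "120", "160", "210", "211", "212", "213", "400"}

ID_RE = re.compile(r"^\s*<(\d{3})>")

def scan_identifiers(listing):
    """Return (unknown identifiers, missing mandatory identifiers), both sorted."""
    seen = set()
    for line in listing.splitlines():
        match = ID_RE.match(line)
        if match:
            seen.add(match.group(1))
    return sorted(seen - KNOWN), sorted(ALWAYS_MANDATORY - seen)

sample = """<110> Japan Atomic Energy Research Institute
<120> A method for efficiently determining a DNA strand break
<160> 4
<200> 1
<211> 855
<212> DNA
<213> Deinococcus radiodurans
<400> 1"""
unknown, missing = scan_identifiers(sample)
print("unknown:", unknown, "missing:", missing)
```

A listing that uses <200> where <210> belongs shows up twice in such a scan, once as an unknown identifier and once as a missing mandatory one, which matches the pair of messages the verification summary gives for each sequence in this report.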



5. Section 1.824 is revised to read as follows:

§ 1.824 Form and format for nucleotide and/or amino acid sequence
submissions in computer readable form.

(a) The computer readable form required by § 1.821(e) shall meet the
following specifications:

(1) The computer readable form shall contain a single "Sequence Listing"
as either a diskette, series of diskettes, or other permissible media
outlined in paragraph (c) of this section.

(2) The "Sequence Listing" in paragraph (a)(1) of this section shall be
submitted in American Standard Code for Information Interchange (ASCII)
text. No other formats shall be allowed.

(3) The computer readable form may be created by any means, such as word
processors, nucleotide/amino acid sequence editors or other custom
computer programs; however, it shall conform to all specifications
detailed in this section.

(4) File compression is acceptable when using diskette media, so long as
the compressed file is in a self-extracting format that will decompress
on one of the systems described in paragraph (b) of this section.

(5) Page numbering shall not appear within the computer readable form
version of the "Sequence Listing" file.

(6) All computer readable forms shall have a label permanently affixed
thereto on which has been hand-printed or typed: the name of the
applicant, the title of the invention, the date on which the data were
recorded on the computer readable form, the operating system used, a
reference number, and an application serial number and filing date, if
known.
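
The ASCII requirement in paragraph (a)(2) can be checked at the byte level before a file is copied to disk. The sketch below assumes the file contents have already been read as bytes; `first_non_ascii` is a hypothetical helper name.

```python
def first_non_ascii(data: bytes):
    """Return the offset of the first byte outside 7-bit ASCII, or None if all bytes are ASCII."""
    for offset, byte in enumerate(data):
        if byte > 0x7F:
            return offset
    return None

clean = b"<212> TYPE: DNA\r\n"
suspect = "<213> ORGANISM: caf\u00e9".encode("utf-8")
print(first_non_ascii(clean), first_non_ascii(suspect))
```

Files saved from a word processor in its native format usually fail a check like this near the start, which is one way to catch the "Non-ASCII" error listed in the summary before submission.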

(b) Computer readable form submissions must meet these format
requirements:

(1) Computer: IBM PC/XT/AT, or compatibles, or Apple Macintosh;

(2) Operating System: MS-DOS, Unix or Macintosh;



l/?«J/»/0 I \ * I'M 



